China AI Index — China's AI Ecosystem Tracker

01.AI

🟣 Private 📍 Beijing 📅 Founded 2023 👥 ~200

01.AI was founded in March 2023 by Kai-Fu Lee — former president of Google China and founder of Sinovation Ventures — and achieved unicorn status in just eight months. The company built the open-source Yi model series, becoming one of China's most prominent open-source AI contributors. In early 2025, 01.AI stopped pre-training large language models entirely, pivoting to build commercial applications on top of DeepSeek's models. Its global B2C product PopAI and enterprise AI solutions are now the company's primary revenue drivers. Most of the LLM training team transitioned to a joint lab with Alibaba Cloud in January 2025.

Model Portfolio

Yi-Lightning Text Closed

MoE flagship API model (Oct 2024). Ranked #6 on Chatbot Arena at launch — joint 3rd among LLM companies. 40% faster inference than prior Yi models. Final model before 01.AI halted pre-training in early 2025.

Context: 16K
Input: $0.14/1M tokens
Output: $0.14/1M tokens

Yi-Coder Code Open

Code model (Sep 2024); 1.5B and 9B variants; supports 52 programming languages; 128K context window.

Context: 128K

Yi-1.5 Text Open

Open-source series (May 2024); 6B–34B variants. Improved coding, math, reasoning, and instruction-following over original Yi.

Context: not listed

Yi-VL-34B Multimodal Open

34B vision-language model (early 2024). Open-weight multimodal extension of Yi-34B.

Context: not listed

Yi-34B Text Open

Founding open-source model (Nov 2023). Topped HuggingFace Open LLM Leaderboard and C-Eval at launch. 200K-token context variant (Yi-34B-200K) also available.

MMLU: 76.3/100
Context: 200K

Financing

Data compiled from public sources and fact-checked by Tony Peng with the assistance of AI. Round sizes and valuations may be approximate.

Total raised: $300M+ (all rounds)
Valuation: $1B (post-money, approx.)
Last round: 2023 (most recent year)

Round Timeline

2023-11: Series A, $300M (Tencent, Xiaomi, Alibaba Cloud, Sinovation Ventures)
2024-08: Strategic, undisclosed (Southeast Asian consortium)

↗ Source: Crunchbase

Products & Applications

PopAI ↗

AI-powered document workspace targeting global markets. Features chat-with-document, AI presentation generation, and visual understanding. Supports 200+ languages. Primary B2C revenue driver for 01.AI following the company's 2025 pivot away from LLM pre-training.

Tags: Chat, Documents, Global, Consumer

Key People

Kai-Fu Lee

李开复

Founder & Chairman

Company	Model	Type	Released	Context	Input Price	Output Price	Highlights
Zhipu AI (Z.ai)	GLM-5.2 Open	Text	2026-06	1M	$1.40/1M tokens	$4.40/1M tokens	753B params; Rolled out to all GLM Coding Plan tiers (Lite/Pro/Max/Team); standalone API + MIT open weights
MiniMax	MiniMax M3 Open	Multimodal	2026-06	1M	$0.30/1M tokens	$1.20/1M tokens	428B params (23B active); 1-million-token context window; novel attention mechanism called MSA; native multimodality
Moonshot AI	Kimi-K2.7-Code Open	Code	2026-06	256K	$0.95/1M tokens	$4.00/1M tokens	1T total params / 32B active, 384 experts, ~30% lower reasoning-token usage vs K2.6
StepFun	Step 3.7 Flash Open	Multimodal	2026-06	256K	$0.2/1M tokens	$1.15/1M tokens	198B total / 11B active params; 1.8B ViT vision encoder; 3 reasoning levels (low/med/high); ~400 TPS; Apache 2.0; supports NVIDIA NIM on-prem deployment
Meituan	LongCat-2.0 Open	Text	2026-06	1M			1.6T params, MoE; trained on 50,000 domestic Chinese chips; next-gen flagship
Alibaba	Qwen3.7-Max Closed	Text	2026-05	1M	$2.50/1M tokens	$7.50/1M tokens	Flagship closed-source reasoning model. Ranked #1 on Artificial Analysis Intelligence Index (57/100) out of 218 models at release. 1M token context window. Text-only (no multimodal). Architecture reportedly dual-72B. Extended thinking / chain-of-thought mode. Released May 19, 2026.
Baidu	ERNIE 5.1 Closed	Text	2026-05	128K	$0.59/1M tokens	$2.65/1M tokens	MoE successor to ERNIE 5.0. ~⅓ total params, ~½ active params, 6% of pre-training cost. AIME 2026: 99.6% with tool use (#2 globally). Arena Search: 1,223 (#4 global, #1 China). Released May 9, 2026.
Baichuan AI	Baichuan-M4 Closed	Text	2026-05	—			Medical AI flagship (May 2026). World #1 on HealthBench, HealthBench Hard & HealthBench Professional. Hallucination rate 3.3% — industry low via factuality-aware RL. Surpasses GPT-5.5, Opus 4.7, DeepSeek-V4-Pro. Powers 百小医 AI family doctor.
ModelBest	MiniCPM-V 4.6 Open	Multimodal	2026-05	260K			1.3B multimodal model (SigLip2 + Qwen3.5-0.8B). 260K token context. Runs on consumer phones (iOS/Android/HarmonyOS). Launched May 2026.
ModelBest	MiniCPM5-1B Open	Text	2026-05	128K			1B language model with 128K token context. SOTA on-device LLM at its size class. Apache 2.0.
Zhipu AI (Z.ai)	GLM-5.1 Open	Text	2026-04	200K	$0.98/1M tokens	$3.08/1M tokens	754B flagship — top scores on AIME 2026 (95.3) and GPQA-Diamond (86.2); strong agentic and coding ability
MiniMax	MiniMax-M2.7 Open	Agent	2026-04	1M tokens	$0.28/1M tokens	$1.20/1M tokens	229B MoE; self-evolving agent — 30% perf gain over 100+ rounds; agent teams, complex skills, 24/7 background agents
Moonshot AI	Kimi K2.6 Open	Multimodal	2026-04	256K	$0.60/1M tokens	$2.50/1M tokens	1T MoE (32B active), 256K ctx; native multimodal agentic — top SWE-Bench (80.2) and AIME 2026 (96.4)
DeepSeek	DeepSeek-V4-Pro Open	Text	2026-04	1M	$0.435/1M tokens	$0.87/1M tokens	Flagship MoE: 1.6T total / 49B active params, 1M context, FP4+FP8 precision, built-in Think/Non-Think reasoning modes. Codeforces rating 3206.
DeepSeek	DeepSeek-V4-Flash Open	Text	2026-04	1M	$0.14/1M tokens	$0.28/1M tokens	Efficient MoE: 284B total / 13B active params, 1M context. Same architecture as V4-Pro at far lower compute cost. Codeforces 2816.
Baidu	ERNIE-Image Open	Image Gen	2026-04	—	$0.03/image		8B single-stream Diffusion Transformer (DiT). Best text rendering in open source (LongTextBench: 0.9733). Up to 4K output. Apache 2.0. Released Apr 15, 2026.
Ant Group	Ring-2.6-1T Open	Reasoning	2026-04	262K	$0.075/1M tokens	$0.625/1M tokens	1T total params, ~63B active per token; adaptive reasoning via high / xhigh modes; up to 66K output tokens; open weights (MIT); released May 8 2026; free tier available on OpenRouter
Ant Group	Ling-2.6-1T Open	Text	2026-04	262K	$0.075/1M tokens	$0.625/1M tokens	1T total params, ~63B active per token; fast thinking approach cuts token cost to ~1/4 of comparable models; open weights (MIT); released Apr 23 2026; free tier available on OpenRouter via Novita AI
Ant Group	Ling-2.6-flash Open	Text	2026-04	262K	$0.10/1M tokens	$0.30/1M tokens	Efficient sparse MoE: 104B total / 7.4B active, hybrid linear attention. 340 tokens/s on 4× H20. Strong agent and tool-use performance.
Tencent	Hy3-preview Closed	Text	2026-04	256K	¥1.2/1M tokens	¥4.0/1M tokens	295B MoE (21B active); reasoning, coding, and agentic workloads. Via Tencent Cloud TokenHub. Released Apr 23, 2026.
Xiaomi	MiMo-V2.5-Pro Open	Text	2026-04	1M	$0.0036/1M tokens	$0.087/1M tokens	1.02T MoE, 42B active; KV-cache reduced ~7×; latest generation
Xiaomi	MiMo-V2-Pro Closed	Text	2026-03	1M	$0.0036/1M tokens	$0.087/1M tokens	1T+ MoE, 42B active; hybrid attention; 1M token context
Unitree Robotics	UnifoLM-VLA-0 Open	Embodied	2026-03	N/A			Vision-Language-Action model enabling the G1 humanoid to autonomously perform household tasks from natural language commands. Runs onboard the robot. Open-sourced March 2026 — Unitree's first AI model release.
Kunlun Tech	SkyReels V4 Closed	Video Gen	2026-03	—			Audio-visual creation model (Mar 2026). Dual-stream architecture; #1 globally on Text-to-Video (With Audio) and Image-to-Video (With Audio) tracks at release. Generates clips up to 3 minutes.
Kunlun Tech	Mureka V9 Closed	Audio	2026-03	—			Music generation model (Mar 2026). Paragraph-level text control; enhanced mixing quality, vocal expression, and style richness.
Kunlun Tech	Matrix-Game 3.0 Closed	Text	2026-03	—			Physics-simulation interactive world model (Mar 2026). 5B params; 720P @ 40FPS; covers 1,000+ scenarios with Unreal Engine data. Industrial-grade real-time interactivity.
ByteDance	Seed2.0 Pro Closed	Multimodal	2026-02	272K	$0.47/1M tokens	$2.37/1M tokens	ByteDance's flagship multimodal model. Understands text, image, and video. Ranks #3 globally on LMSYS Vision Arena and #6 on overall text arena. AIME 2025: 98.3, SWE-bench: 76.5%.
ByteDance	Seedream 5.0 Lite Closed	Image Gen	2026-02	—	$0.026/image		Unified multimodal image generation with chain-of-thought visual reasoning, real-time web search, and native editing. Supports up to 14 reference images. Up to 4K output at 2048×2048.
ByteDance	Seedance 2.0 Closed	Video Gen	2026-02	—			Unified audio-video joint generation model. Accepts text, image, video, and audio inputs; outputs native 2K video (up to 15s) with synchronized audio. 30% faster than Seedance 1.5 Pro.
Alibaba	Qwen3.5 Open	Multimodal	2026-02	262K (1M w/ YaRN)			Unified vision-language MoE family. Flagship: 35B-A3B (35B total / 3B active) and 397B-A17B. Thinking mode on by default. Supports image, video, text input. 201 languages. Apache 2.0.
Alibaba	Qwen3-Coder-Next Open	Code	2026-02	256K			80B total / 3B active MoE coding agent. Excels at long-horizon agentic tasks, tool use, and IDE integration (Claude Code, Cline, Qwen Code). No thinking mode.
Zhipu AI (Z.ai)	GLM-5 Open	Text	2026-02	200K	$1.40/1M tokens	$4.40/1M tokens	754B MoE (40B active); 28.5T token pre-training; top-tier SWE-bench score of 77.8
MiniMax	MiniMax-M2.5 Open	Agent	2026-02	1M tokens	$0.30–$1.00/hr	$0.30–$1.00/hr	229B; SOTA SWE-Bench (80.2); 80% of MiniMax's own code generated by this model; M2.5-Lightning at 100 tok/s
Baidu	ERNIE 5.0 Closed	Multimodal	2026-02	128K			2.4 trillion parameter unified multimodal (text + image + video + audio) in a single autoregressive framework. LMArena Text: 1,460 (#1 China, #8 global); Vision: 1,226 (#1 China, #8 global). Released Feb 6, 2026.
Ant Group	LingBot-VLA Open	Embodied	2026-02				Open-source VLA foundation model trained on ~20,000 hours of real-world dual-arm robot data across 9 embodiments
Ant Group	Ring-2.5-1T Open	Text	2026-02	256K			World's first open-source 1T-parameter thinking model. Gold medal level at IMO 2025 (35/42 pts) and CMO 2025 (105/126). 3× throughput for sequences >32K.
Ant Group	Ming-flash-omni-2.0 Open	Multimodal	2026-02	—			Any-to-any omni model: accepts image, text, video, and audio; outputs image, text, and audio. 100B total / 6B active MoE. Supports zero-shot voice cloning and image generation/editing.
Ant Group	LLaDA-2.1-flash Open	Multimodal	2026-02	—			Novel diffusion-based language model — not autoregressive. 103B params. Generates text by iterative token editing rather than left-to-right decoding. 102K monthly downloads.
Baichuan AI	Baichuan-M3 Open	Text	2026-02	—			235B medical model (Feb 2026) built on Qwen3 architecture. Former world #1 on HealthBench. Hallucination rate 3.5%. Outperforms human doctors in diagnostic accuracy. 48GB VRAM with W4 quantization.
Moonshot AI	Kimi K2.5 Open	Multimodal	2026-01	256K	$0.40/1M tokens	$1.90/1M tokens	1T MoE (32B active); trained on 15T vision+text tokens; thinking & instant modes; agent swarm support
DeepSeek	DeepSeek-OCR-2 Open	Multimodal	2026-01	—			3B visual OCR model with document-to-markdown conversion, layout-aware grounding, and dynamic resolution. 1.66M monthly downloads.
Tencent	HunyuanImage 3.0 Open	Image Gen	2026-01	—			80B MoE (13B active); text-to-image and image-to-image with reasoning. Open weight on HuggingFace. Released Jan 26, 2026.
Meituan	LongCat-Flash-Lite Open	Text	2026-01	256K			68.5B MoE, 2.9–4.5B active; fast and efficient
AIsphere	PixVerse R1 Closed	Video Gen	2026-01	N/A			Real-time world model; 1080p, <15s latency, physics-aware, infinite temporal continuity
SenseTime	SenseNova V6.5 Omni Closed	Multimodal	2026-01	—			Real-time multimodal streaming model. Powers SenseChat's live audio/video interaction. Successor to V6 Pro.
DeepSeek	DeepSeek-V3.2 Open	Text	2025-12	128K	$0.252/1M tokens	$0.378/1M tokens	685B MoE with DeepSeek Sparse Attention (DSA). V3.2-Speciale variant won gold at 2025 IMO and IOI. 4.16M monthly downloads on HuggingFace.
Kuaishou	Kling I2V 2.0 Closed	Multimodal	2025-12	N/A	$0.16/video (5s)		Image-to-video with industry-leading subject preservation
Xiaomi	MiMo-V2-Flash Open	Text	2025-12	N/A	$0.01/1M tokens	$0.30/1M tokens	309B MoE, 15B active; trained on 27T tokens; open-source release
AIsphere	PixVerse V5.5 Closed	Video Gen	2025-12	N/A			Text/image-to-video; HD output, multiple aspect ratios, character consistency
Tencent	Hunyuan3D 3.0 Closed	3D Gen	2025-11	—			3D asset generation from text, image, or sketch input
iFlyTek	Spark X1.5 Closed	Text	2025-11	—			MoE reasoning model (29.3B total / 3B active parameters). Supports 130+ languages. Runs on a single Huawei Ascend server. Launched Nov 2025.
MiniMax	MiniMax-M2 Open	Agent	2025-10	1M tokens			230B MoE (10B active); interleaved thinking; open-source under Modified MIT; free API available
Infinigence AI	Megrez2-3x7B-A3B Open	Text	2025-09	—			MoE edge model: 3×7B experts, 3B active parameters. Open-weight, Apache 2.0.
Alibaba	Qwen-Image Open	Image Gen	2025-08	—			Image generation and editing foundation model. Exceptional Chinese + English text rendering. Supports style transfer, object insertion/removal, layered editing, and depth/edge estimation. Released Aug 2025.
ModelBest	MiniCPM-V 4.5 Open	Multimodal	2025-08	—			8B multimodal model (Qwen3-8B + SigLIP2). Apache 2.0.
Moonshot AI	Kimi k2 Open	Text	2025-07	128K	¥0.12/1K tokens	¥0.12/1K tokens	Agentic model with tool use, web browsing, and code execution
MiniMax	MiniMax-M1 Open	Text	2025-06	1M tokens	$0.80/1M tokens	$2.20/1M tokens	First open-weight large-scale hybrid-attention reasoning model; test-time compute scaling
MiniMax	Hailuo-02 Open	Video Gen	2025-06	N/A	$0.06/video (5s)		Physics-aware video generation; top-3 globally on VBench
Shengshu Technology	Vidu Q3 Closed	Video Gen	2025-06	—	~$0.07/sec		World's first storytelling-focused video model. Up to 16 sec, 1080p 24fps, native audio sync. Multilingual. Pro and Turbo variants. MCP integration.
DeepSeek	DeepSeek-R1-0528 Open	Text	2025-05	128K	$0.50/1M tokens	$2.15/1M tokens	Latest R1 update: 685B, adds system prompt support, deeper reasoning (23K avg tokens on AIME vs 12K prior). Distilled 8B version achieves 86% on AIME 2024.
Kuaishou	Kling 2.0 Closed	Video Gen	2025-04	N/A	$0.16/video (5s)		4K-capable video generation; advanced camera controls and scene coherence
Kuaishou	Kolors 2.0 Open	Image Gen	2025-04	N/A	$0.004/image		Upgraded photorealistic image model; open-source available
SenseTime	SenseNova V6 Pro Closed	Multimodal	2025-04	256K	¥2.8/1M tokens	¥8.4/1M tokens	620B MoE hybrid. Real-time audio/video streaming. Ranked #1 in China in multimodal reasoning at launch. Lowest reasoning cost in industry at launch (Apr 2025).
Kunlun Tech	Skywork-OR1-32B Open	Text	2025-04	—			Open-source math and coding reasoning model (Apr 2025). 32B params; rivals DeepSeek-R1 on competitive programming benchmarks. 7B variant also available.
Baidu	ERNIE X1 Closed	Text	2025-03	128K	$0.28/1M tokens	$1.10/1M tokens	Baidu's first dedicated reasoning model with extended chain-of-thought
Baidu	ERNIE 4.5 Closed	Multimodal	2025-03	128K	$0.55/1M tokens	$2.20/1M tokens	Improved multimodal and reasoning; best ERNIE for Chinese enterprise tasks. The flagship ERNIE 4.5 model is closed-source, though several variants in the ERNIE 4.5 family are openly available on HuggingFace.
StepFun	Step-3 Closed	Text	2025-01	1M	$0.57/1M tokens	$1.42/1M tokens	StepFun's 2025 flagship; 1M token context and native reasoning mode
Baichuan AI	Baichuan-M2 Open	Text	2025-01	—			32B medical reasoning model (2025) built on Qwen2.5-32B with innovative Large Verifier System for real-world clinical reasoning.
ModelBest	MiniCPM-o 2.6 Open	Multimodal	2025-01	—			8B any-to-any model (text + vision + speech). Surpasses GPT-4o and Gemini 1.5 Pro on single-image understanding benchmarks. Launched Jan 2025.
Kuaishou	Kling 1.6 Pro Closed	Video Gen	2024-12	N/A	$0.14/video (5s)		Previous flagship; supports 1080p, 5-10s clips, camera controls
Infinigence AI	Megrez-3B-Omni Open	Multimodal	2024-12	—			3B omni model processing text, vision, and audio. Outperforms LLaVA-NeXT-Yi-34B on vision benchmarks. Optimised for on-device and edge deployment.
Kunlun Tech	Skywork-o1 Open	Text	2024-11	—			Reasoning model (Nov 2024) — among China's first o1-style models with chain-of-thought Chinese logical reasoning. Open-source 8B variant (Llama 3.1 base) plus proprietary advanced version.
Baichuan AI	Baichuan4-Turbo Closed	Text	2024-10	192K			Enterprise flagship general LLM (late 2024). 10%+ usability gain vs prior generation; priced at ~80% of GPT-4o. Supports multimodal input.
Baichuan AI	Baichuan4-Air Closed	Text	2024-10	192K			MoE variant (PRI architecture) of Baichuan4. High performance at low cost for API deployments.
01.AI	Yi-Lightning Closed	Text	2024-10	16K	$0.14/1M tokens	$0.14/1M tokens	MoE flagship API model (Oct 2024). Ranked #6 on Chatbot Arena at launch — joint 3rd among LLM companies. 40% faster inference than prior Yi models. Final model before 01.AI halted pre-training in early 2025.
StepFun	Step-1V Closed	Multimodal	2024-09	200K	¥0.034/1K tokens	¥0.1/1K tokens	Vision-language model supporting image input
01.AI	Yi-Coder Open	Code	2024-09	128K			Code model (Sep 2024); 1.5B and 9B variants; supports 52 programming languages; 128K context window.
iFlyTek	Spark 4.0 Closed	Text	2024-08	—			Flagship LLM (Aug 2024). Claims comparable performance to GPT-4 Turbo on Chinese language benchmarks.
StepFun	Step-2 Closed	Text	2024-07	256K	¥0.038/1K tokens	¥0.12/1K tokens	Very long context window; strong document understanding
StepFun	Step-1X-Image Closed	Image Gen	2024-07	N/A	¥0.04/image		High-resolution image generation model
01.AI	Yi-1.5 Open	Text	2024-05	—			Open-source series (May 2024); 6B–34B variants. Improved coding, math, reasoning, and instruction-following over original Yi.
01.AI	Yi-VL-34B Open	Multimodal	2024-01	—			34B vision-language model (early 2024). Open-weight multimodal extension of Yi-34B.
01.AI	Yi-34B Open	Text	2023-11	200K			Founding open-source model (Nov 2023). Topped HuggingFace Open LLM Leaderboard and C-Eval at launch. 200K-token context variant (Yi-34B-200K) also available.
Kunlun Tech	Skywork-13B Open	Text	2023-10	—			Founding open-source bilingual LLM (Oct 2023). 13B params; pretrained on 3.2T tokens. Led same-scale models on CEVAL, CMMLU, and MMLU at launch.
Baichuan AI	Baichuan2 Open	Text	2023-09	4K			Open-source bilingual LLM series (Sep 2023); 7B and 13B variants; trained on 2.6T tokens. Available on Hugging Face under permissive license.
Baidu	ERNIE-Speed Closed	Text	—	128K			Free tier for prototyping and light production
Meituan	LongCat-Image Open	Image Gen	—	N/A			Image generation model, ~6B params; data-quality focused; released Dec 2025
Meituan	LongCat-Video Open	Video Gen	—	N/A			Text-to-video generation model; released Oct 2025
Meituan	LongCat-Flash-Omni Open	Multimodal	—	256K			Text + vision + audio multimodal
Meituan	LongCat-Flash-Thinking Open	Reasoning	—	256K			Chain-of-thought reasoning variant of LongCat-Flash
Meituan	LongCat-Flash-Chat Open	Text	—	128K			560B MoE, 27B active; open-source; 500K free tokens/day
WeRide	WeRide GENESIS Closed	Embodied	—	—			Generative Engineered Neural Environment for Simulated Intelligence in Self-driving. WeRide's proprietary general-purpose simulation platform combining physical AI with generative AI. Rapidly builds photorealistic simulated cities in minutes, generates diverse edge-case scenarios from billions of km of real-world data, and models realistic pedestrian and driver behavior — all at centimeter-level fidelity. Supports L2++ through L4 AV development and validation via four modules: AI Scenarios, AI Agents, AI Metrics, and AI Diagnosis.
Pony.ai	PonyWorld 2.0 Closed	Embodied	—	—			Second-generation world model underpinning the Virtual Driver L4 autonomous driving platform. Introduces an Intention layer — a structured representation of decision-making that enables the system to evaluate its own driving decisions, identify accuracy gaps across scenarios, and direct targeted data collection rather than broad undirected improvement. Direct sensor-to-action architecture with no language models in the inference pipeline; runs on 1016 TOPS across three NVIDIA DRIVE Orin-X SoCs with redundant failover.
Unisound	UniGPT (Shanhai 山海) Closed	Text	—	—			60B+ parameter general large model (v5.0); underpins all Unisound vertical products. Medical, enterprise, and consumer deployments.
Unisound	U2-ASR 2.5 Closed	Audio	—	—			First LLM-based semantic ASR model for Chinese. Covers 100+ dialects across 7 dialect systems. >90% accuracy. Available via Token Hub API.
Unisound	U2-TTS / U2-TTS-Clone Closed	Audio	—	—			Text-to-speech and voice cloning with full-duplex millisecond response. Available via Token Hub API.
Unisound	U1-OCR Closed	Multimodal	—	—			Industrial-grade document intelligence model for OCR and document understanding. Launched February 2026.
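
The Input Price and Output Price columns above mix currencies and units ($ per 1M tokens, ¥ per 1K tokens, plus per-image, per-video, and per-hour rates), so rows are not directly comparable. As a rough illustration, the Python sketch below normalizes per-token prices to a per-1M-token basis and estimates the cost of a single request. The helper names `price_per_million` and `request_cost` are hypothetical, not part of any vendor SDK, and the sketch assumes price cells are formatted exactly as they appear in the table. It does not convert between currencies.

```python
import re
from typing import Optional, Tuple

# Matches per-token prices as written in the table, e.g. "$0.14/1M tokens".
_PRICE = re.compile(r"([$¥])(\d+(?:\.\d+)?)/1([KM]) tokens")


def price_per_million(cell: str) -> Optional[Tuple[str, float]]:
    """Return (currency symbol, price per 1M tokens), or None if the cell
    is empty or not priced per token (e.g. "$0.03/image")."""
    match = _PRICE.fullmatch(cell.strip())
    if match is None:
        return None
    symbol, amount, unit = match.groups()
    scale = 1000 if unit == "K" else 1  # per-1K prices scale up to per-1M
    return symbol, float(amount) * scale


def request_cost(input_cell: str, output_cell: str,
                 input_tokens: int, output_tokens: int
                 ) -> Optional[Tuple[str, float]]:
    """Estimate the cost of one request from two table price cells."""
    inp = price_per_million(input_cell)
    out = price_per_million(output_cell)
    if inp is None or out is None or inp[0] != out[0]:
        return None  # missing price or mixed currencies: no estimate
    cost = (input_tokens * inp[1] + output_tokens * out[1]) / 1_000_000
    return inp[0], cost


if __name__ == "__main__":
    # Yi-Lightning row: $0.14/1M tokens for both input and output.
    print(request_cost("$0.14/1M tokens", "$0.14/1M tokens", 10_000, 2_000))
```

With the Yi-Lightning prices, a request with 10,000 input tokens and 2,000 output tokens comes to roughly $0.0017. Cells such as `$0.03/image` or `$0.30–$1.00/hr` deliberately return `None` rather than being forced into a per-token figure.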

Company ↕

Model ↕

Type ↕

Released ▼

Context ↕

Input Price ↕

Output Price ↕

Highlights

Zhipu AI (Z.ai)

GLM-5.2 Open

Text

2026-06

$1.40/1M tokens

$4.40/1M tokens

753B params; Rolled out to all GLM Coding Plan tiers (Lite/Pro/Max/Team); standalone API + MIT open weights

MiniMax

MiniMax M3 Open

Multimodal

2026-06

$0.30/1M tokens

$1.20/1M tokens

428B params (23B active); 1-million-token context window; novel attention mechanism called MSA; native multimodality

Moonshot AI

Kimi-K2.7-Code Open

Code

2026-06

256K

$0.95/1M tokens

$4.00/1M tokens

1T total params / 32B active, 384 experts, ~30% lower reasoning-token usage vs K2.6

StepFun

Step 3.7 Flash Open

Multimodal

2026-06

256K

$0.2/1M tokens

$1.15/1M tokens

198B total / 11B active params; 1.8B ViT vision encoder; 3 reasoning levels (low/med/high); ~400 TPS; Apache 2.0; supports NVIDIA NIM on-prem deployment

Meituan

LongCat-2.0 Open

Text

2026-06

1.6T para + MoE; trained on 50,000 domestic Chinese chips; next-gen flagship

Alibaba

Qwen3.7-Max Closed

Text

2026-05

$2.50/1M tokens

$7.50/1M tokens

Flagship closed-source reasoning model. Ranked #1 on Artificial Analysis Intelligence Index (57/100) out of 218 models at release. 1M token context window. Text-only (no multimodal). Architecture reportedly dual-72B. Extended thinking / chain-of-thought mode. Released May 19, 2026.

Baidu

ERNIE 5.1 Closed

Text

2026-05

128K

$0.59/1M tokens

$2.65/1M tokens

MoE successor to ERNIE 5.0. ~⅓ total params, ~½ active params, 6% of pre-training cost. AIME 2026: 99.6% with tool use (#2 globally). Arena Search: 1,223 (#4 global, #1 China). Released May 9, 2026.

Baichuan AI

Baichuan-M4 Closed

Text

2026-05

—

Medical AI flagship (May 2026). World #1 on HealthBench, HealthBench Hard & HealthBench Professional. Hallucination rate 3.3% — industry low via factuality-aware RL. Surpasses GPT-5.5, Opus 4.7, DeepSeek-V4-Pro. Powers 百小医 AI family doctor.

ModelBest

MiniCPM-V 4.6 Open

Multimodal

2026-05

260K

1.3B multimodal model (SigLip2 + Qwen3.5-0.8B). 260K token context. Runs on consumer phones (iOS/Android/HarmonyOS). Launched May 2026.

ModelBest

MiniCPM5-1B Open

Text

2026-05

128K

1B language model with 128K token context. SOTA on-device LLM at its size class. Apache 2.0.

Zhipu AI (Z.ai)

GLM-5.1 Open

Text

2026-04

200K

$0.98/1M tokens

$3.08/1M tokens

754B flagship — top scores on AIME 2026 (95.3) and GPQA-Diamond (86.2); strong agentic and coding ability

MiniMax

MiniMax-M2.7 Open

Agent

2026-04

1M tokens

$0.28/1M tokens

$1.20/1M tokens

229B MoE; self-evolving agent — 30% perf gain over 100+ rounds; agent teams, complex skills, 24/7 background agents

Moonshot AI

Kimi K2.6 Open

Multimodal

2026-04

256K

$0.60/1M tokens

$2.50/1M tokens

1T MoE (32B active), 256K ctx; native multimodal agentic — top SWE-Bench (80.2) and AIME 2026 (96.4)

DeepSeek

DeepSeek-V4-Pro Open

Text

2026-04

$0.435/1M tokens

$0.87/1M tokens

Flagship MoE: 1.6T total / 49B active params, 1M context, FP4+FP8 precision, built-in Think/Non-Think reasoning modes. Codeforces rating 3206.

DeepSeek

DeepSeek-V4-Flash Open

Text

2026-04

$0.14/1M tokens

$0.28/1M tokens

Efficient MoE: 284B total / 13B active params, 1M context. Same architecture as V4-Pro at far lower compute cost. Codeforces 2816.

Baidu

ERNIE-Image Open

Image Gen

2026-04

—

$0.03/image

8B single-stream Diffusion Transformer (DiT). Best text rendering in open source (LongTextBench: 0.9733). Up to 4K output. Apache 2.0. Released Apr 15, 2026.

Ant Group

Ring-2.6-1T Open

Reasoning

2026-04

262K

$0.075/1M tokens

$0.625/1M tokens

1T total params, ~63B active per token; adaptive reasoning via high / xhigh modes; up to 66K output tokens; open weights (MIT); released May 8 2026; free tier available on OpenRouter

Ant Group

Ling-2.6-1T Open

Text

2026-04

262K

$0.075/1M tokens

$0.625/1M tokens

1T total params, ~63B active per token; fast thinking approach cuts token cost to ~1/4 of comparable models; open weights (MIT); released Apr 23 2026; free tier available on OpenRouter via Novita AI

Ant Group

Ling-2.6-flash Open

Text

2026-04

262K

$0.10/1M tokens

$0.30/1M tokens

Efficient sparse MoE: 104B total / 7.4B active, hybrid linear attention. 340 tokens/s on 4× H20. Strong agent and tool-use performance.

Tencent

Hy3-preview Closed

Text

2026-04

256K

¥1.2/1M tokens

¥4.0/1M tokens

295B MoE (21B active); reasoning, coding, and agentic workloads. Via Tencent Cloud TokenHub. Released Apr 23, 2026.

Xiaomi

MiMo-V2.5-Pro Open

Text

2026-04

$0.0036/1M tokens

$0.087/1M tokens

1.02T MoE, 42B active; KV-cache reduced ~7×; latest generation

Xiaomi

MiMo-V2-Pro Closed

Text

2026-03

$0.0036/1M tokens

$0.087/1M tokens

1T+ MoE, 42B active; hybrid attention; 1M token context

Unitree Robotics

UnifoLM-VLA-0 Open

Embodied

2026-03

N/A

Vision-Language-Action model enabling the G1 humanoid to autonomously perform household tasks from natural language commands. Runs onboard the robot. Open-sourced March 2026 — Unitree's first AI model release.

Kunlun Tech

SkyReels V4 Closed

Video Gen

2026-03

—

Audio-visual creation model (Mar 2026). Dual-stream architecture; #1 globally on Text-to-Video (With Audio) and Image-to-Video (With Audio) tracks at release. Generates clips up to 3 minutes.

Kunlun Tech

Mureka V9 Closed

Audio

2026-03

—

Music generation model (Mar 2026). Paragraph-level text control; enhanced mixing quality, vocal expression, and style richness.

Kunlun Tech

Matrix-Game 3.0 Closed

Text

2026-03

—

Physics-simulation interactive world model (Mar 2026). 5B params; 720P @ 40FPS; covers 1,000+ scenarios with Unreal Engine data. Industrial-grade real-time interactivity.

ByteDance

Seed2.0 Pro Closed

Multimodal

2026-02

272K

$0.47/1M tokens

$2.37/1M tokens

ByteDance's flagship multimodal model. Understands text, image, and video. Ranks #3 globally on LMSYS Vision Arena and #6 on overall text arena. AIME 2025: 98.3, SWE-bench: 76.5%.

ByteDance

Seedream 5.0 Lite Closed

Image Gen

2026-02

—

$0.026/image

Unified multimodal image generation with chain-of-thought visual reasoning, real-time web search, and native editing. Supports up to 14 reference images. Up to 4K output at 2048×2048.

ByteDance

Seedance 2.0 Closed

Video Gen

2026-02

—

Unified audio-video joint generation model. Accepts text, image, video, and audio inputs; outputs native 2K video (up to 15s) with synchronized audio. 30% faster than Seedance 1.5 Pro.

Alibaba

Qwen3.5 Open

Multimodal

2026-02

262K (1M w/ YaRN)

Unified vision-language MoE family. Flagship: 35B-A3B (35B total / 3B active) and 397B-A17B. Thinking mode on by default. Supports image, video, text input. 201 languages. Apache 2.0.

Alibaba

Qwen3-Coder-Next Open

Code

2026-02

256K

80B total / 3B active MoE coding agent. Excels at long-horizon agentic tasks, tool use, and IDE integration (Claude Code, Cline, Qwen Code). No thinking mode.

Zhipu AI (Z.ai)

GLM-5 Open

Text

2026-02

200K

$1.40/1M tokens

$4.40/1M tokens

754B MoE (40B active); 28.5T token pre-training; top-tier SWE-bench score of 77.8

MiniMax

MiniMax-M2.5 Open

Agent

2026-02

1M tokens

$0.30–$1.00/hr

229B; SOTA SWE-Bench (80.2); 80% of MiniMax's own code generated by this model; M2.5-Lightning at 100 tok/s

Baidu

ERNIE 5.0 Closed

Multimodal

2026-02

128K

2.4 trillion parameter unified multimodal (text + image + video + audio) in a single autoregressive framework. LMArena Text: 1,460 (#1 China, #8 global); Vision: 1,226 (#1 China, #8 global). Released Feb 6, 2026.

Ant Group

LingBot-VLA Open

Embodied

2026-02

Open-source VLA foundation model trained on ~20,000 hours of real-world dual-arm robot data across 9 embodiments

Ant Group

Ring-2.5-1T Open

Text

2026-02

256K

World's first open-source 1T-parameter thinking model. Gold medal level at IMO 2025 (35/42 pts) and CMO 2025 (105/126). 3× throughput for sequences >32K.

Ant Group

Ming-flash-omni-2.0 Open

Multimodal

2026-02

—

Any-to-any omni model: accepts image, text, video, and audio; outputs image, text, and audio. 100B total / 6B active MoE. Supports zero-shot voice cloning and image generation/editing.

Ant Group

LLaDA-2.1-flash Open

Multimodal

2026-02

—

Novel diffusion-based language model — not autoregressive. 103B params. Generates text by iterative token editing rather than left-to-right decoding. 102K monthly downloads.

Baichuan AI

Baichuan-M3 Open

Text

2026-02

—

235B medical model (Feb 2026) built on Qwen3 architecture. Former world #1 on HealthBench. Hallucination rate 3.5%. Outperforms human doctors in diagnostic accuracy. 48GB VRAM with W4 quantization.

Moonshot AI

Kimi K2.5 Open

Multimodal

2026-01

256K

$0.40/1M tokens

$1.90/1M tokens

1T MoE (32B active); trained on 15T vision+text tokens; thinking & instant modes; agent swarm support

DeepSeek

DeepSeek-OCR-2 Open

Multimodal

2026-01

—

3B visual OCR model with document-to-markdown conversion, layout-aware grounding, and dynamic resolution. 1.66M monthly downloads.

Tencent

HunyuanImage 3.0 Open

Image Gen

2026-01

—

80B MoE (13B active); text-to-image and image-to-image with reasoning. Open weight on HuggingFace. Released Jan 26, 2026.

Meituan

LongCat-Flash-Lite Open

Text

2026-01

256K

68.5B MoE, 2.9–4.5B active; fast and efficient

AIsphere

PixVerse R1 Closed

Video Gen

2026-01

N/A

Real-time world model; 1080p, <15s latency, physics-aware, infinite temporal continuity

SenseTime

SenseNova V6.5 Omni Closed

Multimodal

2026-01

—

Real-time multimodal streaming model. Powers SenseChat's live audio/video interaction. Successor to V6 Pro.

DeepSeek

DeepSeek-V3.2 Open

Text

2025-12

128K

$0.252/1M tokens

$0.378/1M tokens

685B MoE with DeepSeek Sparse Attention (DSA). V3.2-Speciale variant won gold at 2025 IMO and IOI. 4.16M monthly downloads on HuggingFace.

Kuaishou

Kling I2V 2.0 Closed

Multimodal

2025-12

N/A

$0.16/video (5s)

Image-to-video with industry-leading subject preservation

Xiaomi

MiMo-V2-Flash Open

Text

2025-12

N/A

$0.01/1M tokens

$0.30/1M tokens

309B MoE, 15B active; trained on 27T tokens; open-source release

AIsphere

PixVerse V5.5 Closed

Video Gen

2025-12

N/A

Text/image-to-video; HD output, multiple aspect ratios, character consistency

Tencent

Hunyuan3D 3.0 Closed

3D Gen

2025-11

—

3D asset generation from text, image, or sketch input

iFlyTek

Spark X1.5 Closed

Text

2025-11

—

MoE reasoning model (29.3B total / 3B active parameters). Supports 130+ languages. Runs on a single Huawei Ascend server. Launched Nov 2025.

MiniMax

MiniMax-M2 Open

Agent

2025-10

1M tokens

230B MoE (10B active); interleaved thinking; open-source under Modified MIT; free API available

Infinigence AI

Megrez2-3x7B-A3B Open

Text

2025-09

—

MoE edge model: 3×7B experts, 3B active parameters. Open-weight, Apache 2.0.

Alibaba

Qwen-Image Open

Image Gen

2025-08

—

Image generation and editing foundation model. Exceptional Chinese + English text rendering. Supports style transfer, object insertion/removal, layered editing, and depth/edge estimation. Released Aug 2025.

ModelBest

MiniCPM-V 4.5 Open

Multimodal

2025-08

—

8B multimodal model (Qwen3-8B + SigLIP2). Apache 2.0.

Moonshot AI

Kimi K2 Open

Text

2025-07

128K

¥0.12/1K tokens

Agentic model with tool use, web browsing, and code execution
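Prices in this list mix per-1K and per-1M token units, and yuan and dollar amounts, which makes entries hard to compare at a glance. The sketch below puts price strings on a common per-1M-token basis. The regular expression and the `price_per_million` helper are illustrative assumptions about the string format seen in this list, and no currency conversion is attempted.

```python
import re

# Matches strings such as "¥0.12/1K tokens" or "$0.80/1M tokens".
PRICE_RE = re.compile(r"^([¥$])(\d+(?:\.\d+)?)/(1K|1M) tokens$")

# Multiplier that converts the quoted unit to a per-1M-token price.
UNIT_TO_PER_MILLION = {"1K": 1_000, "1M": 1}


def price_per_million(text: str) -> tuple[str, float]:
    """Return (currency symbol, price per 1M tokens) for a listed price string."""
    match = PRICE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised price format: {text!r}")
    currency, amount, unit = match.groups()
    return currency, float(amount) * UNIT_TO_PER_MILLION[unit]


for listed in ["¥0.12/1K tokens", "$0.80/1M tokens", "¥2.8/1M tokens"]:
    currency, per_million = price_per_million(listed)
    print(f"{listed} -> {currency}{per_million:.2f} per 1M tokens")
```

Per-video and per-image prices in the list deliberately fail to parse here, since they are not token-based.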

MiniMax

MiniMax-M1 Open

Text

2025-06

1M tokens

$0.80/1M tokens

$2.20/1M tokens

First open-weight large-scale hybrid-attention reasoning model; test-time compute scaling

MiniMax

Hailuo-02 Open

Video Gen

2025-06

N/A

$0.06/video (5s)

Physics-aware video generation; top-3 globally on VBench

Shengshu Technology

Vidu Q3 Closed

Video Gen

2025-06

—

~$0.07/sec

World's first storytelling-focused video model. Up to 16 sec, 1080p 24fps, native audio sync. Multilingual. Pro and Turbo variants. MCP integration.

DeepSeek

DeepSeek-R1-0528 Open

Text

2025-05

128K

$0.50/1M tokens

$2.15/1M tokens

Latest R1 update: 685B, adds system prompt support, deeper reasoning (23K avg tokens on AIME vs 12K prior). Distilled 8B version achieves 86% on AIME 2024.

Kuaishou

Kling 2.0 Closed

Video Gen

2025-04

N/A

$0.16/video (5s)

4K-capable video generation; advanced camera controls and scene coherence

Kuaishou

Kolors 2.0 Open

Image Gen

2025-04

N/A

$0.004/image

Upgraded photorealistic image model; open-source available

SenseTime

SenseNova V6 Pro Closed

Multimodal

2025-04

256K

¥2.8/1M tokens

¥8.4/1M tokens

620B MoE hybrid. Real-time audio/video streaming. Ranked #1 in China in multimodal reasoning at launch. Lowest reasoning cost in industry at launch (Apr 2025).

Kunlun Tech

Skywork-OR1-32B Open

Text

2025-04

—

Open-source math and coding reasoning model (Apr 2025). 32B params; rivals DeepSeek-R1 on competitive programming benchmarks. 7B variant also available.

Baidu

ERNIE X1 Closed

Text

2025-03

128K

$0.28/1M tokens

$1.10/1M tokens

Baidu's first dedicated reasoning model with extended chain-of-thought

Baidu

ERNIE 4.5 Closed

Multimodal

2025-03

128K

$0.55/1M tokens

$2.20/1M tokens

Improved multimodal and reasoning; best ERNIE for Chinese enterprise tasks. The flagship ERNIE 4.5 model is closed-source, though several variants in the ERNIE 4.5 family are openly available on HuggingFace.

StepFun

Step-3 Closed

Text

2025-01

$0.57/1M tokens

$1.42/1M tokens

StepFun's 2025 flagship; 1M token context and native reasoning mode

Baichuan AI

Baichuan-M2 Open

Text

2025-01

—

32B medical reasoning model (2025) built on Qwen2.5-32B with innovative Large Verifier System for real-world clinical reasoning.

ModelBest

MiniCPM-o 2.6 Open

Multimodal

2025-01

—

8B any-to-any model (text + vision + speech). Surpasses GPT-4o and Gemini 1.5 Pro on single-image understanding benchmarks. Launched Jan 2025.

Kuaishou

Kling 1.6 Pro Closed

Video Gen

2024-12

N/A

$0.14/video (5s)

Previous flagship; supports 1080p, 5-10s clips, camera controls

Infinigence AI

Megrez-3B-Omni Open

Multimodal

2024-12

—

3B omni model processing text, vision, and audio. Outperforms LLaVA-NeXT-Yi-34B on vision benchmarks. Optimised for on-device and edge deployment.

Kunlun Tech

Skywork-o1 Open

Text

2024-11

—

Reasoning model (Nov 2024) — among China's first o1-style models with chain-of-thought Chinese logical reasoning. Open-source 8B variant (Llama 3.1 base) plus proprietary advanced version.

Baichuan AI

Baichuan4-Turbo Closed

Text

2024-10

192K

Enterprise flagship general LLM (late 2024). 10%+ usability gain vs prior generation; priced at ~80% of GPT-4o. Supports multimodal input.

Baichuan AI

Baichuan4-Air Closed

Text

2024-10

192K

MoE variant (PRI architecture) of Baichuan4. High performance at low cost for API deployments.

01.AI

Yi-Lightning Closed

Text

2024-10

16K

$0.14/1M tokens

$0.14/1M tokens

MoE flagship API model (Oct 2024). Ranked #6 on Chatbot Arena at launch, joint 3rd among LLM companies. 40% faster inference than prior Yi models. Final model before 01.AI halted pre-training in early 2025.

StepFun

Step-1V Closed

Multimodal

2024-09

200K

¥0.034/1K tokens

¥0.1/1K tokens

Vision-language model supporting image input

01.AI

Yi-Coder Open

Code

2024-09

128K

Code model (Sep 2024); 1.5B and 9B variants; supports 52 programming languages; 128K context window.

iFlyTek

Spark 4.0 Closed

Text

2024-08

—

Flagship LLM (Aug 2024). Claims comparable performance to GPT-4 Turbo on Chinese language benchmarks.

StepFun

Step-2 Closed

Text

2024-07

256K

¥0.038/1K tokens

¥0.12/1K tokens

Very long context window; strong document understanding

StepFun

Step-1X-Image Closed

Image Gen

2024-07

N/A

¥0.04/image

High-resolution image generation model

01.AI

Yi-1.5 Open

Text

2024-05

—

Open-source series (May 2024); 6B–34B variants. Improved coding, math, reasoning, and instruction-following over original Yi.

01.AI

Yi-VL-34B Open

Multimodal

2024-01

—

34B vision-language model (early 2024). Open-weight multimodal extension of Yi-34B.

01.AI

Yi-34B Open

Text

2023-11

200K

Founding open-source model (Nov 2023). Topped HuggingFace Open LLM Leaderboard and C-Eval at launch. 200K-token context variant (Yi-34B-200K) also available.

Kunlun Tech

Skywork-13B Open

Text

2023-10

—

Founding open-source bilingual LLM (Oct 2023). 13B params; pretrained on 3.2T tokens. Led same-scale models on C-Eval, CMMLU, and MMLU at launch.

Baichuan AI

Baichuan2 Open

Text

2023-09

Open-source bilingual LLM series (Sep 2023); 7B and 13B variants; trained on 2.6T tokens. Available on Hugging Face under permissive license.

Baidu

ERNIE-Speed Closed

Text

—

128K

Free tier for prototyping and light production

Meituan

LongCat-Image Open

Image Gen

2025-12

N/A

Image generation model, ~6B params; data-quality focused; released Dec 2025

Meituan

LongCat-Video Open

Video Gen

2025-10

N/A

Text-to-video generation model; released Oct 2025

Meituan

LongCat-Flash-Omni Open

Multimodal

—

256K

Text + vision + audio multimodal

Meituan

LongCat-Flash-Thinking Open

Reasoning

—

256K

Chain-of-thought reasoning variant of LongCat-Flash

Meituan

LongCat-Flash-Chat Open

Text

—

128K

560B MoE, 27B active; open-source; 500K free tokens/day

WeRide

WeRide GENESIS Closed

Embodied

—

Generative Engineered Neural Environment for Simulated Intelligence in Self-driving. WeRide's proprietary general-purpose simulation platform combining physical AI with generative AI. Rapidly builds photorealistic simulated cities in minutes, generates diverse edge-case scenarios from billions of km of real-world data, and models realistic pedestrian and driver behavior — all at centimeter-level fidelity. Supports L2++ through L4 AV development and validation via four modules: AI Scenarios, AI Agents, AI Metrics, and AI Diagnosis.

Pony.ai

PonyWorld 2.0 Closed

Embodied

—

Second-generation world model underpinning the Virtual Driver L4 autonomous driving platform. Introduces an Intention layer — a structured representation of decision-making that enables the system to evaluate its own driving decisions, identify accuracy gaps across scenarios, and direct targeted data collection rather than broad undirected improvement. Direct sensor-to-action architecture with no language models in the inference pipeline; runs on 1016 TOPS across three NVIDIA DRIVE Orin-X SoCs with redundant failover.

Unisound

UniGPT (Shanhai 山海) Closed

Text

—

60B+ parameter general large model (v5.0); underpins all Unisound vertical products. Medical, enterprise, and consumer deployments.

Unisound

U2-ASR 2.5 Closed

Audio

—

First LLM-based semantic ASR model for Chinese. Covers 100+ dialects across 7 dialect systems. >90% accuracy. Available via Token Hub API.

Unisound

U2-TTS / U2-TTS-Clone Closed

Audio

—

Text-to-speech and voice cloning with full-duplex millisecond response. Available via Token Hub API.

Unisound

U1-OCR Closed

Multimodal

2026-02

Industrial-grade document intelligence model for OCR and document understanding. Launched February 2026.

Company	Chip	Role	Peak Compute	Memory Tech	Capacity	Bandwidth	Interconnect Bandwidth	TDP	Process	Status	Highlights
Huawei	Ascend 910B	Training	256 TFLOPS BF16	HBM2e	64 GB	2.0 TB/s	392 GB/s HCCS 3.0	400 W	SMIC N+2	Production	Most-deployed domestic training chip. Used at Alibaba, ByteDance, Baidu, and Tencent.
Huawei	Ascend 910C	Training	800 TFLOPS FP16	HBM2e	96 GB	4.0 TB/s	800 GB/s HCCS	900 W	SMIC N+2	Production	Dual-die upgrade to 910B. Powers the Atlas 900 A3 SuperPoD (300 PFLOPS).
Huawei	Ascend 950	Training	1 PFLOPS FP8	—	128–144 GB	—	2.0 TB/s per chip	—	—	Planned	2026 target. Powers Atlas 950 SuperPoD (8 EFLOPS); targets Nvidia B200-class.
Huawei	Ascend 960	Training	2 PFLOPS FP8	—	—	—	—	—	—	Planned	2027 roadmap. Targets 2× Ascend 950 performance.
Alibaba (T-Head)	Hanguang 800	Inference	820 TOPS INT8	—	—	512 GB/s	—	—	12nm	Production	Alibaba's first custom chip (2019). Inference-only for internal search and recommendation.
Alibaba (T-Head)	PPU	Training / Inference	—	HBM	96 GB	—	— (PCIe 5.0)	—	—	Production	Domestic H20 rival. ~16,000 units deployed at China Unicom (2025).
Alibaba (T-Head)	Zhenwu M890	Training / Inference	—	HBM3	144 GB	—	800 GB/s ICN Switch 1.0	—	—	Production	Alibaba's flagship (May 2026). Claims 3× H20 on agentic workloads.
Baidu (Kunlunxin)	Kunlun 2	Training / Inference	128 TFLOPS FP16 / 256 TOPS INT8	—	—	—	—	~120 W	TSMC 7nm	Production	2nd-gen Kunlun (2021). Deployed at Baidu Cloud and third-party customers.
Baidu (Kunlunxin)	Kunlun P800	Training	345 TFLOPS FP16	—	—	—	close to H20 (reported)	—	—	Production	Latest Kunlun (2025). 30,000-chip clusters handle DeepSeek-scale training.
Cambricon	MLU590	Training / Inference	345 TFLOPS FP16 / FP8 support	HBM2e	64 GB	1.2 TB/s	—	300 W	TSMC 7nm	Production	Cambricon's flagship (2023) with FP8 support. Drove the company to its first profit.
Cambricon	MLU690	Training	H100-class target	—	—	—	—	—	—	Planned	In development. Targets H100-class performance.
Biren Technology	BR100	Training	1,000 TOPS INT8	HBM2e	64 GB	2.0 TB/s	— (PCIe 5.0)	550 W	TSMC 7nm	Restricted	China's first GPU-class chip. TSMC access cut off by US export controls (2022).
Moore Threads	MTT S4000	Training / Inference	100 TFLOPS FP16	GDDR6	48 GB	768 GB/s	— (PCIe Gen5)	450 W	TSMC 7nm	Production	Flagship data-centre GPU. MUSIFY layer provides CUDA compatibility.
Enflame	Cloudragon T20	Training	128 TFLOPS FP16	HBM2e	64 GB	1.8 TB/s	— (GCU-LARE)	300 W	TSMC 7nm	Production	Tencent-backed training chip for domestic cloud deployments.
Iluvatar CoreX	TianGai 100	Training	147 TFLOPS FP16	HBM2	32 GB	—	— (PCIe Gen4)	300 W	TSMC 7nm	Production	Domestic inference accelerator deployed at Chinese hyperscalers.
Moore Threads	MTT S80	Training / Inference	14.4 TFLOPS FP32	GDDR6	16 GB	448 GB/s	— (PCIe Gen5)	—	—	Production	Mid-range GPU. 4,096 MUSA cores.
Moore Threads	MTT S70	Inference	11.2 TFLOPS FP32	GDDR6	7 GB	392 GB/s	— (PCIe Gen4)	—	—	Production	Entry-level GPU. 3,584 MUSA cores.
Enflame	S60	Training / Inference	~128 TFLOPS FP16 (est.)	—	—	—	—	—	TSMC N6NTO-HPC	Production	3rd-gen accelerator. 70,000+ units shipped by mid-2025.
Enflame	L600	Training / Inference	— (FP8 support)	—	144 GB	3.6 TB/s	~800 GB/s (est.)	—	TSMC N6NTO-HPC	Production	4th-gen flagship (July 2025). Targets NVIDIA H20-class performance.
Enflame	i20	Inference	128 TFLOPS FP16/BF16; 256 TOPS INT8	HBM2e	16 GB	~819 GB/s	— (PCIe)	—	—	Production	Dedicated inference accelerator. 256 TOPS INT8.
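Because the chip table is tab-separated, it can be loaded and filtered with standard tooling. The sketch below is one possible approach using Python's `csv` module on a few rows and a subset of columns copied from the table above. The helper names `capacity_gb` and `production_chips` are hypothetical, and cells that are empty ("—") or ranges ("128–144 GB") are treated as unknown rather than guessed.

```python
import csv
import io
import re

# A subset of columns and rows copied from the table above.
COLUMNS = ["Company", "Chip", "Capacity", "Status"]
SAMPLE_ROWS = [
    ["Huawei", "Ascend 910B", "64 GB", "Production"],
    ["Huawei", "Ascend 950", "128–144 GB", "Planned"],
    ["Cambricon", "MLU590", "64 GB", "Production"],
    ["Moore Threads", "MTT S80", "16 GB", "Production"],
]
tsv_text = "\n".join("\t".join(row) for row in [COLUMNS] + SAMPLE_ROWS)


def capacity_gb(cell: str):
    """Parse a cell like '64 GB' into a float; return None for '—' or ranges."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?) GB", cell.strip())
    return float(match.group(1)) if match else None


def production_chips(text: str, min_gb: float) -> list[str]:
    """Names of chips in production with at least min_gb of listed memory."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    names = []
    for row in reader:
        gb = capacity_gb(row["Capacity"])
        if row["Status"] == "Production" and gb is not None and gb >= min_gb:
            names.append(row["Chip"])
    return names


print(production_chips(tsv_text, 64))
```

Treating unparseable cells as unknown keeps planned or partially specified chips out of numeric comparisons instead of silently assigning them a value.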


Cluster / Supernode	Operator	Chip	Chip Count	Total Compute	Interconnect Fabric	Notes
Atlas 900 A3 SuperPoD (CloudMatrix 384)	Huawei Cloud	Ascend 910C	384	300 PFLOPS FP16	All-optical (OXC)	Huawei's current-generation supernode. Uses optical circuit switching (OXC) for low-latency inter-chip communication. Announced April 2025 as a full-stack domestic alternative to Nvidia GB200 NVL72.
Atlas 950 SuperPoD	Huawei Cloud	Ascend 950	8,192	8 EFLOPS FP8	16.3 PB/s aggregate	Next-generation supernode targeting 2026 delivery. 1,152 TB total memory. Full liquid cooling. Footprint larger than two basketball courts. Would represent a major leap in domestic training capacity if delivered at spec.
Atlas 960 SuperPoD	Huawei Cloud	Ascend 960	15,488	— (2027 roadmap)	—	Huawei's 2027 roadmap cluster. No confirmed specs; scale figure from Huawei roadmap slides.
Kunlun Training Cluster	Baidu / Kunlunxin	Kunlun P800	30,000	DeepSeek-scale (hundreds of B params)	—	Baidu's 30,000-chip P800 cluster reported capable of training models with hundreds of billions of parameters, comparable to DeepSeek-scale workloads. Deployed 2025.
Panjiu Supernode (AL128)	Alibaba Cloud	Zhenwu M890	128 / rack	—	800 GB/s ICN Switch 1.0	Alibaba's modular supernode architecture. 128 Zhenwu M890 chips per rack unit, fully liquid cooled. Announced May 2026 alongside the M890.
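The cluster totals can be sanity-checked against the per-chip figures in the chip table. The sketch below multiplies chip count by per-chip peak compute and divides the Atlas 950 aggregate memory and bandwidth by its chip count. It assumes a naive linear aggregate with no efficiency loss, 1 TB = 1,024 GB for memory, and 1 PB/s = 1,000 TB/s for bandwidth; the function name `aggregate_pflops` is hypothetical.

```python
def aggregate_pflops(chip_count: int, per_chip_tflops: float) -> float:
    """Naive aggregate peak: chips x per-chip peak, converted from TFLOPS to PFLOPS."""
    return chip_count * per_chip_tflops / 1_000


# Atlas 900 A3: 384 x Ascend 910C at 800 TFLOPS FP16 (listed total: 300 PFLOPS).
atlas_900_pflops = aggregate_pflops(384, 800)

# Atlas 950: 8,192 x Ascend 950 at 1 PFLOPS FP8 (listed total: 8 EFLOPS).
atlas_950_pflops = aggregate_pflops(8_192, 1_000)

# Atlas 950 per-chip memory from 1,152 TB total (assumes 1 TB = 1,024 GB).
atlas_950_mem_gb = 1_152 * 1_024 / 8_192

# Atlas 950 per-chip share of 16.3 PB/s aggregate (assumes 1 PB/s = 1,000 TB/s).
atlas_950_bw_tbs = 16.3 * 1_000 / 8_192

print(f"Atlas 900 A3: {atlas_900_pflops:.1f} PFLOPS")
print(f"Atlas 950: {atlas_950_pflops / 1_000:.3f} EFLOPS")
print(f"Atlas 950 memory per chip: {atlas_950_mem_gb:.0f} GB")
print(f"Atlas 950 bandwidth per chip: {atlas_950_bw_tbs:.2f} TB/s")
```

Under these assumptions the per-chip and cluster rows agree to within rounding: roughly 307 PFLOPS against the listed 300, about 8.2 EFLOPS against 8, 144 GB per chip at the top of the listed 128–144 GB range, and just under the listed 2.0 TB/s per chip.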


The essential tracker for
China's AI ecosystem

China's Chip Ecosystem
